A METHOD FOR COMPARING DIGITAL XBT COMPRESSION TECHNIQUES

by 

Richard  F.J.  Winterburn 


ABSTRACT 




The compression of digitally recorded XBT profile data is a necessary
prerequisite to their use in many applications. A method of comparing
different compression techniques using a UNIVAC 1106 computer is presented.
Three such techniques were applied to over 200 XBT profiles recorded during
various cruises of the R.V. MARIA PAOLINA G.; a comparison of the results
has identified the optimum method, which has now been adopted for XBT data
compression at SACLANTCEN.


INTRODUCTION 


With the acquisition of digital XBT data into the UNIVAC 1106 System [1]
and the subsequent development of a comprehensive software package to edit
the profiles interactively [2], it has become necessary to examine various
means of compressing the data to a more manageable size. This is
particularly necessary when it is required to insert the XBT profile into the
SMODS Data Base [3] (which would allow use of the available display/analysis
software [4]), since the maximum allowable length is 125 D/T
(depth/temperature) pairs.

This  memorandum  describes  a  software  comparator  designed  to  allow  the 
effect  of  any  filter/compression  algorithm  on  an  XBT  profile  to  be  compared 
with  that  of  another.  It  also  presents  results  of  a  comparison  of  three 
such techniques applied to the XBT data of two SACLANTCEN cruises in
1977/78  [5,6]. 


1  METHOD 

Figure 1 gives a simplified flow-chart of the steps of the comparison
program. In essence the program inputs a number of XBT profiles, subjects
each of them to a number of compression techniques, and stores the results
for a later comparison of the overall effect of each technique. As can be
seen, the program contains two nested loops: LOOP1 is repeated for each
filter and LOOP2 is repeated for each profile. These iterations are
followed by the "best-effect" statistical computations. In this way the
I/O (input/output), filtering, and comparison sections are completely
independent of one another, allowing easy modification of the software to
include new filters, compression techniques, output displays, etc. as they
are required.
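
The independence of the filtering and comparison sections can be
illustrated with a minimal Python sketch. It is not the UNIVAC program:
the names run_comparator, every_other and fraction_kept are hypothetical,
and the choice of the profile loop as the outer loop is an assumption.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Profile = Tuple[Sequence[float], Sequence[float]]  # (depths, temperatures)


def run_comparator(
    profiles: Sequence[Profile],
    compressors: Dict[str, Callable[[Profile], Profile]],
    score: Callable[[Profile, Profile], float],
) -> Dict[str, List[float]]:
    """Apply every compressor to every profile and collect one score each."""
    results: Dict[str, List[float]] = {name: [] for name in compressors}
    for profile in profiles:                        # one pass per profile
        for name, compress in compressors.items():  # one pass per technique
            results[name].append(score(profile, compress(profile)))
    return results


def every_other(profile: Profile) -> Profile:
    """Toy compressor (hypothetical): keep every second point."""
    depths, temps = profile
    return depths[::2], temps[::2]


def fraction_kept(original: Profile, compressed: Profile) -> float:
    """Toy score (hypothetical): fraction of points remaining."""
    return len(compressed[0]) / len(original[0])


demo = run_comparator(
    [([0, 1, 2, 3], [20, 19, 18, 17])],
    {"identity": lambda p: p, "every_other": every_other},
    fraction_kept,
)
```

A new technique is added by registering one more entry in the dictionary
of compressors; neither the loops nor the scoring function need to change.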

At step B there is a choice of three techniques of compression, each of
which is described in detail in Ch. 2. The output from this step may be
either a filtered profile or a compressed profile. If it is the former,
it is linearly interpolated onto a fixed-increment depth scale as a
means of compression; for the present tests this has been a 5 m scale
within the depth limits of the filtered profile. If it is the latter,
no such interpolation is carried out.
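
The interpolation of a filtered profile onto a fixed-increment depth scale
might be sketched as follows. The function name resample_fixed_increment
and the rule that the grid starts and ends on whole multiples of the
increment inside the profile limits are assumptions, not details taken
from the program.

```python
import numpy as np


def resample_fixed_increment(depth, temp, step=5.0):
    """Linearly interpolate a profile onto a grid of `step` metres.

    The grid is confined to the depth limits of the input profile
    (assumed to be sorted by increasing depth).
    """
    depth = np.asarray(depth, dtype=float)
    temp = np.asarray(temp, dtype=float)
    first = np.ceil(depth[0] / step) * step    # first grid level inside profile
    last = np.floor(depth[-1] / step) * step   # last grid level inside profile
    grid = np.arange(first, last + step / 2.0, step)
    return grid, np.interp(grid, depth, temp)


# Synthetic profile sampled every 0.6 m (illustrative values only).
z = np.arange(3.6, 460.0, 0.6)
t = 20.0 - 0.02 * z
grid, t_grid = resample_fixed_increment(z, t)
ratio = len(grid) / len(z)
```

For a profile sampled every 0.6 m, the ratio of retained to original
points is close to 0.6/5, i.e. about 12%.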

The comparison itself is made by first linearly interpolating each
compressed profile to the same depths as the original profile (step C) and
then conducting a statistical analysis of the temperature variability at
these depths between each original value and its compressed profile value;
this is described in detail, with examples, in Ch. 3.


2  THE  COMPRESSORS 

Three  compression  techniques  have  been  compared,  viz: 

Method 1 : Lanczos filter and interpolation.

Method 2 : Significant point selection.

Method 3 : Smoothing polynomial and interpolation.

Of these, Methods 1 and 3 are digital filters followed by discrete
sampling, whereas Method 2 is an objective selection technique yielding
directly a compressed profile. Each of these will now be described in
detail.


2.1 Lanczos Filter

The XBT data are originally digitized at a time interval equivalent to a
depth increment of 0.6 m and, after filtering, the smoothed data will be
sampled at 5 m increments. It follows therefore that for any filter to act
on these data, the Nyquist frequency, which is expressed as the reciprocal
of twice the time interval between successive observations of an
equally-spaced time series, is given by

$$ f_N = \frac{1}{2 \times 0.6} = 0.833 \ \text{cycles/m} $$

(i.e. 1 nyquist = 0.833 cycles/m)

and

$$ f_{\text{cutoff}} = f_c = \frac{1}{2 \times 5} = 0.1 \ \text{cycles/m} = 0.12 \ \text{nyquist} $$

FIG. 1 GENERAL FLOW-CHART

The Lanczos filter [7,8,9,10] is a low-pass gaussian filter; as its weights
sum to unity it ensures that there will be no change in either the phase or
the mean value of the series. However, it is a penalty of such a filter
that the first and last n constituents (n = the number of side weights) of
the series must be discarded.

In the case of XBT data, it has been recognized [11] that the first 4 m of
the profile are normally unreliable due to problems of the response of the
sensor to the air/sea temperature difference at launch time. This
corresponds to the first six scans of the digitization and therefore, in order
to retain as much of the profile as possible, the filter characteristics
have been computed over 13 weights, i.e. a central principal weight and six
matched pairs of side weights. A compromise must always be made here, as
the number of weights directly influences the slope of the gain of the
filter: if the number of weights is increased, although the slope of the
gain would increase, the number of data points at the limits of the series
to be discarded would also increase.

The  weights  of  the  Lanczos  filter  have  been  computed  by  [9]: 

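$$ s_j = \frac{v_j}{\sum_{i=-n}^{n} v_i}, \qquad j = 0, 1, \ldots, 6 $$

where j = the weight identifier

n = the number of side terms

$$ v_j = \frac{\sin(\pi j / n)\,\sin(\pi j f_c)}{j^2 \pi^2 f_c / n} \qquad (f_c \text{ in nyquists}) $$

which, substituting

f_c = 0.12

n = 6,

gives

$$ v_j = \frac{\sin(0.524 j)\,\sin(0.377 j)}{0.2 j^2}, \qquad j = 0, 1, \ldots, 6 $$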

from which the central and right-hand side weights have been computed as
shown in Table 1, and, using the computer program of Pesaresi [10], the
amplitude response as a function of frequency has been plotted (Fig. 2).

TABLE 1

CENTRAL AND RIGHT HAND SIDE WEIGHTS OF APPLIED LANCZOS FILTER

    j      s_j

    0      0.1676
    1      0.1543
    2      0.1165
    3      0.0841
    4      0.0453
    5      0.0160
    6      0.0000
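
As an illustration of how such weights might be generated and applied, the
following Python sketch evaluates the weight expression with f_c = 0.12
nyquist and six side weights, taking the j = 0 weight as its limiting value
of 1 before normalization. The function names lanczos_weights and
apply_filter are hypothetical, and the sketch is not the program of [10].
Because it evaluates the expression at full precision, the weights it
returns need not reproduce Table 1 exactly.

```python
import numpy as np


def lanczos_weights(fc_nyquist=0.12, n_side=6):
    """Return the 2*n_side + 1 normalized weights s_j, j = -n_side..n_side."""
    j = np.arange(-n_side, n_side + 1)
    # np.sinc(x) = sin(pi*x)/(pi*x), with sinc(0) = 1, so the product below
    # equals sin(pi*j/n)*sin(pi*j*fc) / (j**2 * pi**2 * fc / n).
    v = np.sinc(j / n_side) * np.sinc(j * fc_nyquist)
    return v / v.sum()          # weights sum to unity


def apply_filter(series, weights):
    """Filter a series; the first and last n_side points are discarded."""
    return np.convolve(np.asarray(series, dtype=float), weights, mode="valid")


w = lanczos_weights()
smoothed = apply_filter(np.full(50, 15.0), w)   # constant series stays constant
```

With 13 weights, a series of N points yields N - 12 filtered values, which
corresponds to the discarding of six points at each end.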

Therefore, with this filter, the first and last 6 points (i.e. 3.6 m) of
the profile are discarded; the resultant profile, interpolated at 5 m depth
increments, yields a compression to approximately 12% of the original.


2.2  Significant  Point  Selection 

This  is  an  objective  selection  technique  that  examines  each  point  on  the 
trace  and  decides,  within  certain  criteria,  if  the  point  may  be  classified 
as  redundant.  This  is  carried  out  as  follows.  The  first  point  on  the 
profile  is  always  accepted,  then  for  each  subsequent  point  (see  Fig.  3), 
given  that  A  is  the  last  point  to  have  been  retained  and  that  points  B,  C, 
and D have been classified as redundant, the problem is to determine if
point  E  should  be  retained.  The  two  parameters  chosen  to  control  this 
decision  are: 

(a)  the  gradient  change  at  the  point 

(b) the deviation from the original trace of the straight-line
segment created by discarding the point (a on Fig. 3).

The user of the routine has to input a minimum allowable value (parameter
D2DIV) that the change in gradient must exceed to classify the point as a
significant point. If this occurs, then the point is retained. However,
in the case of a curve where the gradient change does not exceed this value
(e.g. Fig. 3), the second criterion is applied before the point is discarded.
This checks whether the deviation of the resultant line segment (a) is greater
than a user-defined limit (parameter DELTAT), and the point is discarded
only if this limit is not exceeded.

This criterion of deviation from the original record is an internationally
accepted standard; the IOC manual on international oceanographic data
exchange [12] states that for XBT data

"...flexure points determined in such a way that linear
interpolations fall within ± 0.2°C of the original record."

Thus in Fig. 3, the point under consideration, E, will be discarded only if

the gradient change at E < D2DIV

and

a < DELTAT.
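
The two-criterion decision can be sketched in Python as follows. Several
details are assumptions made for illustration only: the gradient change is
taken as the absolute difference between the slopes of the segments on
either side of the point, the deviation is taken as the largest temperature
difference between the skipped points and the straight line joining the
last retained point to the next point, and the last point of the profile
is always kept. The function name significant_points is hypothetical.

```python
import numpy as np


def significant_points(depth, temp, d2div, deltat):
    """Return the indices of the points retained by the two criteria."""
    depth = np.asarray(depth, dtype=float)
    temp = np.asarray(temp, dtype=float)
    n = len(depth)
    if n <= 2:
        return list(range(n))
    kept = [0]                                  # first point always accepted
    for k in range(1, n - 1):
        a = kept[-1]                            # last retained point (A)
        # Criterion (a): gradient change at the candidate point.
        g_before = (temp[k] - temp[k - 1]) / (depth[k] - depth[k - 1])
        g_after = (temp[k + 1] - temp[k]) / (depth[k + 1] - depth[k])
        if abs(g_after - g_before) > d2div:
            kept.append(k)
            continue
        # Criterion (b): deviation of the line A -> (k + 1) from the
        # original points that would be skipped if k were discarded.
        idx = np.arange(a + 1, k + 1)
        line = np.interp(depth[idx],
                         [depth[a], depth[k + 1]],
                         [temp[a], temp[k + 1]])
        if np.max(np.abs(temp[idx] - line)) > deltat:
            kept.append(k)
    kept.append(n - 1)                          # assumption: keep last point
    return kept


z = np.arange(10.0)
straight = significant_points(z, 20.0 - 0.1 * z, d2div=0.2, deltat=0.035)
kinked = significant_points(z, np.where(z <= 4, 20.0, 24.0 - z),
                            d2div=0.2, deltat=0.035)
```

A straight profile reduces to its two end points, whereas a profile with a
single sharp change of gradient also retains the point at which the
gradient changes.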

This algorithm has been tested on over 200 XBT profiles, using
fifteen combinations of D2DIV and DELTAT.

The results of using D2DIV incremented by 0.1 from 0.1 to 0.3, and DELTAT
incremented by 0.010 from 0.005 to 0.045, are shown in Table 2, which for
each combination of D2DIV and DELTAT gives the standard deviation of the
error, the maximum error, and the mean percentage of points remaining after
the compression has been carried out. From these results, the combination
of D2DIV = 0.2 and DELTAT = 0.035°C (outlined in heavy black) has been
selected as the most effective in compressing the profile to less than 125
data points (a SMODS data base requirement [3]) and yet having a maximum
error close to the absolute accuracy of the instrument. Figure 4 shows the
effect of increasing the value of DELTAT while compressing the same
original profile, the number at the bottom of each profile being the number of
constituent data points after each compression.


TABLE 2 SIGNIFICANT POINT METHOD TEST RESULTS


            STANDARD DEVIATION       PERCENT REMAINING        MAXIMUM ERROR

  D2DIV     0.1     0.2     0.3      0.1     0.2     0.3      0.1    0.2    0.3
  DELTAT

  0.005     0.000   0.009   0.009    31.59   35.98   35.92    0.05   0.12   0.10
  0.015     0.000   0.009   0.009    31.58   35.76   35.70    0.05   0.12   0.10
  0.025     0.001   0.010   0.010    31.42   33.00   32.83    0.05   0.12   0.10
  0.035     0.001   0.022   0.022    31.37   14.78   14.57    0.06   0.17   0.14
  0.045     0.001   0.023   0.023    31.37   13.30   13.02    0.06   0.17   0.14

FIG. 4 SIGNIFICANT POINT SELECTION

Effect of increasing value of parameter DELTAT
(recoverable axis labels: % REMAINING, TEMPERATURE)

FUNCTION OF ORIGINAL NUMBER OF POINTS


SM-131 


The  rate  of  compression  with  this  method  varies  with  the  complexity  of  the 
profile  but  in  general  approximates  14%  of  the  original.  This  percentage 
reduction  increases  with  the  number  of  data  points,  as  a  result  of  the 
simplified  nature  of  the  temperature  profile  with  an  increase  in  depth, 
requiring  fewer  data  points  to  be  retained.  This  is  clearly  shown  on 
Fig.  5  where  the  percentage  of  points  remaining  is  plotted  as  a  function  of 
the  number  of  original  points  for  222  profiles. 

The number of points retained by fixed-level interpolation is of course
directly related to the sampling interval; as stated in Sect. 2.1,
if a sampling interval of 5 m is required, the profile is reduced to 12% of
its original length. This varies slightly, as there may be additional data
points above and below the 5 m start and finish levels. This 12% level is
shown on Fig. 5 together with the mean value of the significant point
reduction, i.e. 14.78%.


2.3 Smoothing Polynomial

This  is  a  direct  application  of  the  UNIVAC  library  routine  MOVAVG  [13], 
which  fits  a  second  order  polynomial  over  a  user-defined  point  extent. 

For the purposes of this comparison, a 6-point extent was chosen (i.e. the
smoothing formula goes from i-6 to i+6 for the ith element of the profile)
to maintain compatibility with that used in Method 1, and the first and
last six points were discarded after the smoothing. Thus, after the 5 m
interpolation is completed, a compression to approximately 12% is also
achieved.
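
The idea of a moving second-order polynomial fit can be illustrated with
the Python sketch below. It is not the UNIVAC MOVAVG routine, whose
interface is not reproduced here; the function name poly_smooth and the use
of an ordinary least-squares fit over the points i-6 to i+6 are
assumptions.

```python
import numpy as np


def poly_smooth(values, half_width=6, order=2):
    """Smooth a series with a moving least-squares polynomial fit.

    Each output value is the fitted polynomial evaluated at the centre of
    a window running from i - half_width to i + half_width. The first and
    last half_width points have no full window and are discarded.
    """
    values = np.asarray(values, dtype=float)
    x = np.arange(-half_width, half_width + 1)
    out = []
    for i in range(half_width, len(values) - half_width):
        window = values[i - half_width:i + half_width + 1]
        coeffs = np.polyfit(x, window, order)
        out.append(np.polyval(coeffs, 0.0))     # fitted value at the centre
    return np.array(out)


# A quadratic series is reproduced unchanged by a second-order fit.
k = np.arange(40.0)
quadratic = 0.01 * k**2 - 0.3 * k + 18.0
smoothed_quadratic = poly_smooth(quadratic)
```

As with the Lanczos filter of Method 1, a series of N points yields N - 12
smoothed values.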


3  THE  COMPARISON 

The  program,  having  filtered  and/or  compressed  a  profile  to  less  than  125 
points,  carries  out  a  linear  interpolation  at  the  same  depths  as  the 
original  trace,  i.e.  at  approximately  0.6  m  depth  increments.  From  this 
the  value 

$$ T_D = T_{\text{orig}} - T_{\text{filt}} $$

is computed at each depth, and the mean, maximum, and standard deviation of
T_D for each trace with each method are stored, together with the percentage
number  of  points  of  the  original  profile  remaining.  Table  3  is  an  example 
of  the  computer  report  at  this  stage  and  shows  these  results  for  each 
method  [Lanczos,  Significant-Point  (SIGNIF),  and  polynomial  (POLY)],  using 
individual  BT  profiles  identified  by  a  BT  number  (BTNO)  and  giving  their 
original  number  of  points  (NPT). 
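
The statistics stored for one profile and one method might be computed as
in the following Python sketch. The function name compare_profile, the use
of the largest absolute difference as the "maximum", and the restriction of
the comparison to depths covered by the compressed profile are assumptions.

```python
import numpy as np


def compare_profile(depth, temp, c_depth, c_temp):
    """Compare a compressed profile with the original at the original depths."""
    depth = np.asarray(depth, dtype=float)
    temp = np.asarray(temp, dtype=float)
    c_depth = np.asarray(c_depth, dtype=float)
    c_temp = np.asarray(c_temp, dtype=float)
    # Only depths spanned by the compressed profile can be interpolated.
    inside = (depth >= c_depth[0]) & (depth <= c_depth[-1])
    t_filt = np.interp(depth[inside], c_depth, c_temp)
    t_d = temp[inside] - t_filt                 # T_D = T_orig - T_filt
    return {
        "mean": float(np.mean(t_d)),
        "max": float(np.max(np.abs(t_d))),
        "std": float(np.std(t_d)),
        "pct_remaining": 100.0 * len(c_depth) / len(depth),
    }


stats = compare_profile(
    [0, 1, 2, 3, 4], [10.0, 9.0, 8.5, 7.0, 6.0],   # original (illustrative)
    [0, 2, 4], [10.0, 8.5, 6.0],                   # compressed (illustrative)
)
```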

When the compression of all the profiles is complete, a comparison is
carried out to identify the most successful method in terms of least value
of the mean, the maximum, the standard deviation, and the percentage of
points remaining. This is output for all profiles (Table 4 is an example) for
each method as a matrix on which the numbers signify the three methods:

1  Lanczos  filter 

2  Significant  point  selection, 

3  Polynomial  fitting. 

These  results  are  summed  for  each  statistical  test  and  the  results  output 
both  as  a  total  for  each  method  and  as  a  percentage  of  the  total  number  of 
profiles  under  examination. 

The  results  of  the  individual  filters  are  then  examined  and  average  values 
of  the  mean,  the  standard  deviation,  and  the  percentage  remaining  are 
computed. 
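
The "best-effect" tally for one statistical test can be sketched in Python
as follows. The function name best_effect_totals is hypothetical; comparing
magnitudes rather than signed values, and awarding a tie to the
lowest-numbered method, are assumptions.

```python
import numpy as np


def best_effect_totals(stat):
    """Count, for one statistic, how often each method gives the least value.

    `stat` has one row per profile and one column per method.
    """
    stat = np.abs(np.asarray(stat, dtype=float))
    winners = np.argmin(stat, axis=1)           # best method for each profile
    totals = np.bincount(winners, minlength=stat.shape[1])
    percentages = 100.0 * totals / stat.shape[0]
    return totals, percentages


# Four illustrative profiles, three methods (values are made up).
totals, percentages = best_effect_totals([
    [0.10, 0.05, 0.20],
    [0.30, 0.40, 0.10],
    [0.02, 0.01, 0.50],
    [0.20, 0.10, 0.30],
])
```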


4  APPLICATION 

In  1977/78  the  SACLANTCEN  Oceanographic  group  carried  out  two  prolonged 
cruises  in  the  Gulf  of  Cadiz  [5]  and  Alboran  Sea  [6],  during  which  a  total 
of 222 XBT measurements were made using SACLANTCEN's on-board, on-line
Oceanographic Data Acquisition System. These data have subsequently been
transferred  to  the  SACLANTCEN  UNIVAC  1106  computer  system  [1]  and  cleaned 
and  edited  by  an  in-house-developed  interactive  editing  package  [2]. 
Finally,  these  data  needed  to  be  inserted  into  the  SMODS  data  base  [3]  and 
for  this  reason  their  compression  was  essential. 

These  data  could  therefore  be  used  to  test  the  various  compressions  now 
available,  as  they  include  both  deep  (>  500  m)  and  shallow  (<  500  m)  casts 
with  complex  water  columns  (e.g.  Atlantic/Mediterranean  water  masses). 

The  comparator  has  been  exercised  on  these  data  and  the  most  effective 
method  identified.  Table  5  gives  the  results  of  the  test  for  "best  effect" 
(as  explained  in  Ch.  3)  expressed  in  totals  for  each  method  under  each 
test. These totals are also shown in Table 6 where they are expressed as
percentages of 222.
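
The relation between the two tables is simple arithmetic, shown here as a
short Python check using the totals of Table 5; the variable names are
illustrative only.

```python
# Totals from Table 5: lowest mean, lowest maximum, lowest standard
# deviation, lowest percentage remaining, for Methods 1 to 3.
totals = {
    1: [88, 12, 6, 110],
    2: [29, 202, 146, 112],
    3: [105, 8, 70, 0],
}
n_profiles = 222

percentages = {
    method: [round(100.0 * count / n_profiles, 2) for count in counts]
    for method, counts in totals.items()
}
column_sums = [sum(col) for col in zip(*totals.values())]
```

Each column of totals sums to 222, i.e. every profile is credited to
exactly one method under each test.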

From  these  it  is  clear  that  in  terms  of  least  maximum  error  and  standard 
deviation  of  the  error,  Method  2,  the  significant  point  selection,  is  the 
most  effective.  Although  both  the  Lanczos  filter  and  the  significant  point 
methods  are  equally  effective  at  reducing  the  traces  to  the  least  number  of 
points,  the  significant  point  method  brings  the  profiles  within  the  SMODS 
data  base  limit  of  125  data  points  and  is  therefore  acceptable. 

Table  7,  which  gives  the  average  values  over  the  222  profiles  of  the  error 
parameters  for  each  method,  shows  that  the  average  maximum  error  of 
0.0735°C  is  well  within  the  IOC  requirement  of  0.2°C. 

As a result of these tests, the significant-point selection method has been
adopted as the recommended method of compression for the SMODS data base
systems. To maintain uniformity, and also to reduce to a minimum the
complexity of all future SACLANTCEN digital XBT transfer, only data passing
through this system will be allowed entry into the data base.

TABLE 5 'BEST EFFECT' TOTALS FOR EACH METHOD


            LOWEST        LOWEST        LOWEST           LOWEST
  METHOD    MEAN          MAXIMUM       ST. DEVIATION    PERCENTAGE
            DIFFERENCE    DIFFERENCE    OF DIFFERENCE    REMAINING

  1          88            12             6              110
  2          29           202           146              112
  3         105             8            70                0

TABLE 6 PERCENTAGE OF 'BEST EFFECT' FOR EACH METHOD


            LOWEST        LOWEST        LOWEST           LOWEST
  METHOD    MEAN          MAXIMUM       ST. DEVIATION    PERCENTAGE
            DIFFERENCE    DIFFERENCE    OF DIFFERENCE    REMAINING

  1         39.64          5.41          2.70            49.55
  2         13.06         90.99         65.77            50.45
  3         47.30          3.60         31.53             0.0

TABLE 7 AVERAGE VALUES OF STATISTICAL TEST PARAMETERS

            LOWEST        LOWEST        LOWEST           LOWEST
  METHOD    MEAN          MAXIMUM       ST. DEVIATION    PERCENTAGE
            DIFFERENCE    DIFFERENCE    OF DIFFERENCE    REMAINING


SUMMARY AND CONCLUSIONS

1. A flexible software comparator has been developed to study the
effects of different filter/compression algorithms on digital XBT-profile
data. The software will allow other algorithms to be included in the
future as they are developed.

2. The comparator has been tried and tested on recently-acquired XBT
data and the error limits of the system determined.

3. A recommended standard method of compression has been selected
for entry of digital XBT data into the SACLANTCEN Oceanographic Data Base.

4. The comparator will be extended in the near future to look at the
effects on digital STD/CTD data for their compression and data-base entry.